Geospatial Analysis

Geospatial analysis applies statistical and other analytic techniques to data that has a geographical or spatial aspect. It typically relies on software that can render maps, process spatial data, and apply analytical methods to terrestrial or geographic datasets, including geographic information systems (GIS) and geomatics.

This notebook covers the following:

  1. Automating geospatial analysis workflows using Python.
  2. Creating choropleth maps with Python tools such as matplotlib and plotly.

Task 2.1

Importing Libraries

The following packages and libraries are imported:

  1. Pandas - Open source library providing high-performance, easy-to-use data structures and data analysis tools.
  2. Matplotlib - Python 2D plotting library.
  3. GeoPandas - A powerful package for spatial manipulation.
  4. Seaborn - A Python data visualization library based on matplotlib.

Importing Data

The Urban Population and Total Population CSV files are read with the pandas ".read_csv" function, and each file is stored in a pandas dataframe.
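As a minimal sketch of this step, the snippet below reads a tiny in-memory CSV instead of the real files. The column layout (one row per country, one column per year), the country names, and all values are hypothetical stand-ins, not taken from the datasets used in this notebook.

```python
import io

import pandas as pd

# Hypothetical stand-in for the real CSV file: one row per country,
# one column per year. Names and values are made up for illustration.
csv_text = """Country Name,Country Code,1990,2000
Aland,ALD,100,150
Borduria,BRD,200,
"""

# pd.read_csv accepts a file path or any file-like object.
urban_df = pd.read_csv(io.StringIO(csv_text))

print(urban_df.shape)  # (number of rows, number of columns)
```

With a real file, the "io.StringIO(...)" argument would be replaced by the path to the CSV. The empty last field in the second row is read as a null value (NaN).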

Importing Urban Population Data

Importing Total Population Data

Data Cleaning

The ".shape" attribute of a pandas dataframe gives its number of rows and columns. The urban population dataframe initially has 264 rows and 65 columns.

Checking null values in the urban dataset with a heatmap and the pandas ".isnull()" function.

The ".isnull()" function returns a boolean mask that marks every null value in the dataset.

The yellow lines in the heatmap above indicate the null values in the dataset. The dataset contains 553 null values, which are removed with the ".dropna" function.
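The null count can be obtained from the same boolean mask that feeds the heatmap. The sketch below uses a small made-up dataframe; the names "null_mask", "nulls_per_column", and "total_nulls" are illustrative and not from the original notebook.

```python
import numpy as np
import pandas as pd

# Toy dataframe with two missing values (made-up data).
df = pd.DataFrame({
    "Country Code": ["ALD", "BRD", "CRN"],
    "1990": [100.0, np.nan, 300.0],
    "2000": [150.0, 250.0, np.nan],
})

null_mask = df.isnull()                   # True wherever a value is missing
nulls_per_column = null_mask.sum()        # missing values in each column
total_nulls = int(null_mask.sum().sum())  # missing values in the whole dataframe

print(total_nulls)
```

Passing the mask to seaborn, for example "sns.heatmap(null_mask, cbar=False)", draws one cell per value, with the missing cells in a contrasting colour. The exact colours depend on the colormap in use.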

The ".dropna" function drops rows or columns that contain null values: "axis=0" drops rows and "axis=1" drops columns.

After applying ".dropna", the urban population dataset has zero null values.

After cleaning, the dataframe has 256 rows and 64 columns.
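A minimal sketch of the two axes on made-up data follows. The use of how="all" for the column pass is an assumption: the text only names the "axis" argument, so the exact arguments used in the notebook may differ.

```python
import numpy as np
import pandas as pd

# Toy dataframe: one entirely empty column and one row with a gap (made-up data).
df = pd.DataFrame({
    "Country Code": ["ALD", "BRD", "CRN"],
    "1990": [100.0, np.nan, 300.0],
    "2000": [150.0, 250.0, 350.0],
    "Unnamed": [np.nan, np.nan, np.nan],
})

# Assumption: drop only the columns in which every value is missing.
no_empty_cols = df.dropna(axis=1, how="all")

# Then drop every row that still contains at least one missing value.
cleaned = no_empty_cols.dropna(axis=0)

print(cleaned.shape)
```

Without how="all", "dropna(axis=1)" would remove every column that contains any null value, which on data like this would also discard the "1990" column.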

Checking the shape of Total Population Dataset

Checking null values in the total population dataset with a heatmap and the pandas ".isnull()" function.

The dataset contains 443 null values, which are removed with the ".dropna" function.

After applying ".dropna", the total population dataset has zero null values.

After cleaning, the dataframe has 258 rows and 64 columns.

Importing the built-in world dataset of geopandas

Only the "iso_a3" and "geometry" columns are selected from the built-in geopandas world dataset, as only these columns are used later.

The "iso_a3" column of the world dataset is renamed to "Country Code", so that the world dataset can be merged on this column with both the urban population dataset and the total population dataset.
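The selection and rename can be sketched as follows. To keep the example free of external data, it uses a plain pandas dataframe with placeholder strings in the "geometry" column; the "name" and "pop_est" columns and all values are illustrative assumptions.

```python
import pandas as pd

# Hypothetical stand-in for the world dataset; the geometry values are
# placeholder strings, not real shapes.
world = pd.DataFrame({
    "name": ["Aland", "Borduria"],
    "iso_a3": ["ALD", "BRD"],
    "pop_est": [1000, 2000],
    "geometry": ["<polygon A>", "<polygon B>"],
})

# Keep only the two columns needed later, then rename the key column so it
# matches the "Country Code" column of the population dataframes.
world = world[["iso_a3", "geometry"]].rename(columns={"iso_a3": "Country Code"})

print(list(world.columns))
```

The same indexing and ".rename" calls work on a geopandas GeoDataFrame, since it is a subclass of the pandas dataframe.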

Merging Urban population data frame with world geodataframe

The urban population dataframe is merged with the world geodataframe using the pandas ".merge" function.

Merging total population data frame with world geodataframe

The total population dataframe is merged with the world geodataframe using the pandas ".merge" function.
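A minimal sketch of such a merge on the shared "Country Code" column is shown below, with made-up rows. The join type is an assumption: how="inner" is the pandas default and keeps only the codes present in both dataframes, but the text does not state which join the notebook uses.

```python
import pandas as pd

# Made-up population rows; "XXX" has no match in the world table.
urban_df = pd.DataFrame({
    "Country Code": ["ALD", "BRD", "XXX"],
    "1990": [100.0, 200.0, 300.0],
})

# Made-up world rows; "CRN" has no match in the population table.
world = pd.DataFrame({
    "Country Code": ["ALD", "BRD", "CRN"],
    "geometry": ["<polygon A>", "<polygon B>", "<polygon C>"],
})

# Assumption: an inner join on the shared key column.
urban_merged = urban_df.merge(world, on="Country Code", how="inner")

print(urban_merged)
```

Rows without a partner on the other side ("XXX" and "CRN" here) are dropped by an inner join, which is one reason a merged dataframe can have fewer rows than either input.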

Cleaning Merged Dataframe

Checking type of Data frame

Changing pandas data frame to geopandas data frame

The urban population pandas dataframe is converted to a geopandas GeoDataFrame with the "gpd.GeoDataFrame" constructor, using the geometry column of the dataframe.

The total population pandas dataframe is converted to a geopandas GeoDataFrame with the "gpd.GeoDataFrame" constructor, using the geometry column of the dataframe.

Saving the data to .shp files (shapefiles) with the "to_file" function
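The conversion can be sketched as below, assuming geopandas and shapely are installed. The point geometries, the values, and the choice of "EPSG:4326" as the CRS are illustrative assumptions; the real data holds country polygons.

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import Point

# Plain pandas dataframe whose "geometry" column holds shapely objects
# (made-up points standing in for country shapes).
merged = pd.DataFrame({
    "Country Code": ["ALD", "BRD"],
    "1990": [100.0, 200.0],
    "geometry": [Point(10.0, 50.0), Point(20.0, 40.0)],
})

# Tell geopandas which column carries the geometry and which CRS it is in.
# Assumption: the coordinates are longitude/latitude (EPSG:4326).
urban_gdf = gpd.GeoDataFrame(merged, geometry="geometry", crs="EPSG:4326")

print(type(urban_gdf).__name__)
```

Once the dataframe is a GeoDataFrame, spatial methods such as ".to_crs", ".plot", and ".to_file" become available on it.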

Reading .shp files

The spatial data files for urban population and total population are read with the geopandas ".read_file" function.

CRS and Projections

The Coordinate Reference System (CRS) of a spatial object tells Python where the object's coordinates are located in geographic space. It also tells Python what mathematical method should be used to "flatten", or project, the data onto a two-dimensional map.

Checking the CRS of Urban Population data

Changing the CRS of the urban population geodata to "epsg=3395" (World Mercator), which projects the data onto a flat map.

Checking the CRS of Total Population data

Changing the CRS of the total population geodata to "epsg=3395" (World Mercator), which projects the data onto a flat map.
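A minimal sketch of checking and changing the CRS follows, assuming geopandas (with pyproj) and shapely are installed. The two points and the starting CRS "EPSG:4326" are illustrative assumptions.

```python
import geopandas as gpd
from shapely.geometry import Point

# Toy geodataframe in longitude/latitude degrees (made-up points).
gdf = gpd.GeoDataFrame(
    {"Country Code": ["ALD", "BRD"]},
    geometry=[Point(0.0, 0.0), Point(10.0, 50.0)],
    crs="EPSG:4326",
)

print(gdf.crs)  # inspect the current CRS

# Reproject to World Mercator; coordinates are now in metres.
# to_crs returns a new geodataframe and leaves the original unchanged.
projected = gdf.to_crs(epsg=3395)

print(projected.crs)
```

After reprojection the coordinate values change from degrees to metres, so geodataframes that are plotted or combined together should share the same CRS.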

Solution of Task 2.1

2.1.1 World urban population per capita for the year 1990.

We can calculate urban population per capita by dividing the urban population by the total population.

The world urban population per capita for the year "1990" is saved in the dataframe in a column named "UPC_1990".
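The division can be sketched as follows on made-up numbers. Aligning the two tables with a merge and the suffixed column names "1990_urban" and "1990_total" are assumptions for illustration; only the "UPC_1990" column name comes from the text.

```python
import pandas as pd

# Made-up urban and total population figures for two countries.
urban = pd.DataFrame({"Country Code": ["ALD", "BRD"], "1990": [50.0, 30.0]})
total = pd.DataFrame({"Country Code": ["ALD", "BRD"], "1990": [100.0, 120.0]})

# Align the two tables on the country code so each row divides like by like.
both = urban.merge(total, on="Country Code", suffixes=("_urban", "_total"))

# Urban population per capita = urban population / total population.
both["UPC_1990"] = both["1990_urban"] / both["1990_total"]

print(both[["Country Code", "UPC_1990"]])
```

The same pattern applies to any other year by swapping the year column and the name of the result column.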

2.1.2 World urban population per capita for the year 2000.

The world urban population per capita for the year "2000" is saved in the dataframe in a column named "UPC_2000".

2.1.3 World urban population per capita for the year 2010.

The world urban population per capita for the year "2010" is saved in the dataframe in a column named "UPC_2010".

Plotting choropleth maps for Task 2.1 Solutions

For the year 1990, we plot the choropleth map of world urban population per capita with both matplotlib and plotly, and then compare which map is more interactive and gives more information.

Choropleth Map Using Matplotlib for 2.1.1 (Year 1990)
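A minimal sketch of a matplotlib choropleth drawn through the geopandas ".plot" method is given below. The two box-shaped "countries", their values, and the colormap are illustrative assumptions, not the notebook's data or styling.

```python
import matplotlib

matplotlib.use("Agg")  # off-screen backend for scripts; omit this line in a notebook

import geopandas as gpd
import matplotlib.pyplot as plt
from shapely.geometry import box

# Two made-up rectangular regions with made-up per capita values.
gdf = gpd.GeoDataFrame(
    {"Country Code": ["ALD", "BRD"], "UPC_1990": [0.5, 0.25]},
    geometry=[box(0, 0, 10, 10), box(10, 0, 20, 10)],
    crs="EPSG:4326",
)

fig, ax = plt.subplots(figsize=(8, 4))

# Colour each shape by its value in the chosen column and add a colour bar.
gdf.plot(column="UPC_1990", cmap="viridis", legend=True, ax=ax)

ax.set_title("Urban population per capita, 1990 (toy data)")
ax.set_axis_off()
```

The result is a static image: the colours encode the values, but reading an exact figure for a country requires going back to the dataframe.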

Choropleth Map using Plotly for 2.1.1 (Year 1990)

The plotly Python library is an interactive, open-source plotting library that supports over 40 unique chart types covering a wide range of statistical, financial, geographic, scientific, and 3-dimensional use-cases.

Importing Plotly library

After analysing both of the above plots for the year 1990, we can say that matplotlib gives a basic plot, whereas plotly gives a more interactive and attractive one. Only a few lines of code are needed to create aesthetically pleasing, interactive plots with plotly, and its mouse-hover function saves time when exploring the plot by showing the information for each country.

For all remaining tasks, we use plotly to plot the choropleth maps.

Choropleth Map using Plotly for 2.1.2 (Year 2000)
Choropleth Map using Plotly for 2.1.3 (Year 2010)